A Law of Large Numbers for Weighted Majority 
Abstract

Consider an election between two candidates in which the voters' choices are random and
independent and the probability of a voter choosing the first candidate is $p > 1/2$. Condorcet's
Jury Theorem, which he derived from the weak law of large numbers, asserts that if the number
of voters tends to infinity then the probability that the first candidate will be elected tends to
one. The notion of influence of a voter, or its voting power, is relevant for extensions of the weak
law of large numbers to voting rules which are more general than simple majority. In this paper
we point out two different ways to extend the classical notions of voting power and influences to
arbitrary probability distributions. The extension relevant to us is the "effect" of a voter, which
is a weighted version of the correlation between the voter's vote and the election's outcome.
We prove an extension of the weak law of large numbers to weighted majority games when all
individual effects are small and show that this result does not apply to any voting rule which is
not based on weighted majority.

Keywords: Law of large numbers, voting power, influences, Boolean functions, monotone
simple games, aggregation of information, the voting paradox.



1 Introduction 

Consider a biased coin for which the probability of a "head" is $p > 1/2$. The weak law of large
numbers asserts that if you flip the coin $n$ times then the probability that you will see more heads
than tails tends to one as $n$ tends to $\infty$. Understanding the scope of the weak law of large numbers
when the coin flips are not independent, or when we consider more complicated events than the
event "to see more heads than tails", has attracted considerable attention.

Our motivation came from a game theoretic interpretation: Condorcet's Jury Theorem (see [13])
asserts that in an election between two candidates, say Alice and Bob, if every voter votes for Alice
with probability $p > 1/2$ and for Bob with probability $1 - p$, and if these votes are independent,
then as the number of voters tends to infinity the probability that Alice will be elected tends to
one. Condorcet's Jury Theorem can be interpreted as saying that even if agents receive very poor
(independent) signals indicating which decision is correct, majority voting will nevertheless result
in the correct decision being taken with high probability if there are enough agents (and each
agent votes according to the signal he receives). This phenomenon is referred to as asymptotically
complete aggregation of information and it plays an important role in theoretical economics.
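The convergence asserted by Condorcet's Jury Theorem can be checked numerically. The following is a minimal sketch, assuming an odd number of voters so that ties cannot occur; the function name `majority_win_probability` is ours and is only illustrative.

```python
from math import comb


def majority_win_probability(n: int, p: float) -> float:
    """P[more than n/2 of n independent voters choose Alice], each with probability p."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that ties cannot occur")
    # Sum the binomial probabilities of all outcomes with a strict majority for Alice.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n // 2 + 1, n + 1))


if __name__ == "__main__":
    # The winning probability grows with the size of the electorate.
    for n in (1, 11, 101, 501):
        print(n, majority_win_probability(n, 0.55))
```

For a fixed bias such as 0.55 the printed values increase with the number of voters, in line with the statement of the theorem.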

To describe a more general setting, consider the following framework. Let $f : \{0,1\}^n \to \{0,1\}$
be a Boolean function. We will assume that $f$ is

• monotone non-decreasing, i.e.,

$$(\forall i : x_i \ge y_i) \Rightarrow f(x_1, \dots, x_n) \ge f(y_1, \dots, y_n),$$

• and anti-symmetric, i.e.,

$$f(1 - x_1, 1 - x_2, \dots, 1 - x_n) = 1 - f(x_1, x_2, \dots, x_n).$$

Let $\mu_p$ denote the product probability measure on $\{0,1\}^n$ defined by

$$\mu_p(x_1, x_2, \dots, x_n) = p^k (1 - p)^{n-k} \tag{1}$$

where $k = x_1 + x_2 + \dots + x_n$. We would like to find conditions that guarantee that for a fixed $p > 1/2$,
$\mu_p(f)$ is close to 1. Clearly it is not sufficient that $n$ is large, since even if $f$ is defined on many
variables, it may actually depend only on a few of them. The notion of influence of a variable, which
is closely related to notions of voting power, is important in understanding information aggregation
when we consider general Boolean functions and the product probability measure $\mu_p$. Boolean
functions can describe voting rules and are referred to in the game theoretic literature as simple
games. Anti-symmetric Boolean functions are called strong simple games.

For a Boolean function $f$ and $x = (x_1, x_2, \dots, x_n) \in \{0,1\}^n$ we say that the $k$'th variable is
pivotal for $f$ if $f(x_1, \dots, x_{k-1}, 0, x_{k+1}, \dots, x_n) \ne f(x_1, \dots, x_{k-1}, 1, x_{k+1}, \dots, x_n)$.

Let $\mu$ be an arbitrary probability distribution on $\{0,1\}^n$ and let $f$ be a monotone Boolean
function that we consider as a voting rule. Define the influence or the voting power of the $k$'th
variable as the probability that the $k$'th variable is pivotal. Denote by $I_k^\mu(f)$ the influence of the
$k$'th variable for the Boolean function $f$, w.r.t. the distribution $\mu$. In other words,

$$I_k^\mu(f) = \mu\left[(x_1, \dots, x_n) : f(x_1, \dots, x_{k-1}, 0, x_{k+1}, \dots, x_n) \ne f(x_1, \dots, x_{k-1}, 1, x_{k+1}, \dots, x_n)\right]. \tag{2}$$

The notion of influence is closely related to classical notions of voting power. The Banzhaf
power index of $k$ in $f$ is $I_k^{\mu_{1/2}}(f)$ and the Shapley-Shubik power index of $k$ in $f$ is, by a theorem of
Owen [?], $\int_0^1 I_k^{\mu_p}(f)\,dp$. In [3] the authors proposed to define the voting power as the probability of
being pivotal based on realistic assumptions on individual voting distributions, and discussed advantages
and drawbacks of this approach.
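For small $n$ the influence under the product measure can be computed by brute-force enumeration. The sketch below is only illustrative: the helper names (`mu_p`, `influence`, `shapley_shubik`) are ours, coordinates are 0-based, and the integral over $p$ in Owen's formula is approximated by a midpoint rule.

```python
from itertools import product


def mu_p(x, p):
    """Product measure (1): p^k (1 - p)^(n - k), where k is the number of ones in x."""
    k = sum(x)
    return p**k * (1 - p) ** (len(x) - k)


def influence(f, n, k, p):
    """Probability under mu_p that coordinate k (0-based) is pivotal for f, as in (2)."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        lo = x[:k] + (0,) + x[k + 1:]
        hi = x[:k] + (1,) + x[k + 1:]
        if f(lo) != f(hi):
            total += mu_p(x, p)
    return total


def shapley_shubik(f, n, k, steps=2000):
    """Midpoint-rule approximation of the integral of the influence over p in [0, 1]."""
    h = 1.0 / steps
    return sum(influence(f, n, k, (j + 0.5) * h) for j in range(steps)) * h


def majority(x):
    return int(2 * sum(x) > len(x))


def dictator(x):
    return x[0]


if __name__ == "__main__":
    print(influence(majority, 3, 0, 0.5))   # Banzhaf index of a voter in 3-voter majority
    print(shapley_shubik(majority, 3, 0))   # approximate Shapley-Shubik index
```

For simple majority on three voters a voter is pivotal exactly when the other two disagree, which has probability $2p(1-p)$ under $\mu_p$; integrating over $p$ gives $1/3$.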

For product probability spaces, results of Russo, Talagrand, Friedgut and Kalai assert that for
every $p > 1/2$ sufficiently small influences suffice to guarantee that $\mu_p(f)$ is close to 1. The latest
such result is the following.

Theorem 1.1 (Kalai [?]). Let $f$ be a monotone anti-symmetric Boolean function. For every
$p > 1/2$ and $\epsilon > 0$ there is $\delta > 0$ such that if $I_k^{\mu_p}(f) \le \delta$ for every $k$ then $\mu_p(f) \ge 1 - \epsilon$.

Remark. The conclusion of Theorem 1.1 remains valid if we replace $I_k^{\mu_p}(f)$ by the Banzhaf power
index of $k$ in $f$ or by the Shapley-Shubik power index. For the Shapley-Shubik power index a
reverse implication also holds, see [3]. We choose here a version which relies only on a single
probability distribution $\mu_p$ and hence is more convenient for extensions to arbitrary probability
distributions.

The purpose of this paper is to study extensions of the weak law of large numbers in the context
of general probability distributions. Let $\mu$ be a probability distribution on $\{0,1\}^n$. When $\mu$ is
not a product measure the notion of influence can be extended in a way different from the
above. Define the effect of the $k$'th variable on the Boolean function $f$ as the difference between
the expected value of $f(x_1, \dots, x_n)$ conditioned on $x_k = 1$ and the expected value of $f(x_1, \dots, x_n)$
conditioned on $x_k = 0$, and denote by $e_k^\mu(f)$ the effect of the $k$'th variable for the Boolean function
$f$, w.r.t. the distribution $\mu$. More precisely,

$$e_k^\mu(f) = \mu[f(X_1, \dots, X_n) \mid X_k = 1] - \mu[f(X_1, \dots, X_n) \mid X_k = 0]. \tag{3}$$

The effect is undefined if the probability for $X_k = 1$ is 1 or 0. Writing $\mu[X_k] = p$ and $Y_k = X_k - p$,
we get

$$\begin{aligned}
\mathrm{Cov}_\mu[f(X_1, \dots, X_n), X_k] &= \mu[f(X_1, \dots, X_n) Y_k] \\
&= p\,\mu[(1 - p) f \mid X_k = 1] + (1 - p)\,\mu[-p f \mid X_k = 0] \\
&= p(1 - p)\, e_k^\mu(f)
\end{aligned}$$

so that the effect may be interpreted as a normalized form of the correlation between the individual
vote and the election's outcome.

When $\mu$ is a product probability measure $\mu_p$, the effect $e_k^\mu$ and the influence $I_k^\mu$ coincide,
but in general this is not the case. For instance, for general $\mu$ the effect may be negative (see item
(i) in Section 2) while the influence is of course always non-negative.
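Both the covariance identity and the coincidence of effect and influence under a product measure can be verified by enumeration on a small example. This is a sketch under our own conventions: a measure is represented as a dictionary from 0/1 tuples to probabilities, coordinates are 0-based, and all function names are illustrative.

```python
from itertools import product


def product_measure(n, p):
    """The measure mu_p on {0,1}^n as a dict from points to probabilities."""
    return {x: p ** sum(x) * (1 - p) ** (n - sum(x)) for x in product((0, 1), repeat=n)}


def mean(mu, g):
    """Expectation of g under the measure mu."""
    return sum(weight * g(x) for x, weight in mu.items())


def effect(mu, f, k):
    """mu[f | X_k = 1] - mu[f | X_k = 0], as in (3)."""
    p = mean(mu, lambda x: x[k])
    if not 0 < p < 1:
        raise ValueError("the effect is undefined when mu[X_k = 1] is 0 or 1")
    given_one = mean(mu, lambda x: f(x) * x[k]) / p
    given_zero = mean(mu, lambda x: f(x) * (1 - x[k])) / (1 - p)
    return given_one - given_zero


def covariance(mu, f, k):
    """Cov(f, X_k) computed as mu[f * (X_k - mu[X_k])]."""
    p = mean(mu, lambda x: x[k])
    return mean(mu, lambda x: f(x) * (x[k] - p))


def influence(mu, f, k):
    """Probability under mu that coordinate k is pivotal, as in (2)."""
    return mean(mu, lambda x: int(f(x[:k] + (0,) + x[k + 1:]) != f(x[:k] + (1,) + x[k + 1:])))


def majority(x):
    return int(2 * sum(x) > len(x))


if __name__ == "__main__":
    p = 0.6
    mu = product_measure(3, p)
    print(covariance(mu, majority, 0), p * (1 - p) * effect(mu, majority, 0))
    print(effect(mu, majority, 0), influence(mu, majority, 0))
```

Each printed pair agrees up to floating-point rounding: the first line checks the covariance identity, the second the equality of effect and influence for a product measure and a monotone function.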

It is not true that for general probability distributions and general $f$, small influences imply
aggregation of information. Our main result is that small effects imply aggregation of information
for the particular case of weighted majority functions. Moreover, unlike in Theorem 1.1, the bounds
in our main result are rather realistic.

We call a monotone anti-symmetric function $f$ a weighted majority function if there exist non-negative
weights $w_1, \dots, w_n$, not all zero, such that $f(x_1, \dots, x_n) = 1$ if $\sum_{i=1}^n w_i(2x_i - 1) > 0$
and $f(x_1, \dots, x_n) = 0$ if $\sum_{i=1}^n w_i(2x_i - 1) < 0$. If $n$ is odd and $w_i = 1$ for every $i$, $f$ is called the
majority function (or simple majority).

Note that in our definition of a weighted majority function, if $\sum w_i(2x_i - 1) = 0$ then the
value of $f(x)$ may be either 0 or 1 as long as $f$ is monotone and anti-symmetric. This is different
from the traditional definition of a weighted majority (or threshold) function where $f(x) = 1$ iff
$\sum w_i(2x_i - 1) > 0$ and $f(x) = 0$ iff $\sum w_i(2x_i - 1) < 0$.

Thus for example, any monotone anti-symmetric function $f : \{0,1\}^n \to \{0,1\}$ satisfying $f(x) = 1$
when $x_1 = x_2 = 1$ and $f(x) = 0$ when $x_1 = x_2 = 0$ is a weighted majority function (taking
$w_1 = w_2 = 1$ and $w_3 = \dots = w_n = 0$) according to our definition.



The above example demonstrates that under our definition of weighted majority functions, 
there are at least 2^ weighted majority functions. Under the traditional definition the number 
of weighted majority functions is at most 2" |nill2j. 

Of particular interest are voting schemes where all the voters have the same power. One such
case is when $f$ is invariant under a transitive group of permutations. In other words, there exists a
group of permutations $\Gamma \subset S_n$ such that $f(x_1, \dots, x_n) = f(x_{\sigma(1)}, \dots, x_{\sigma(n)})$ for all $\sigma \in \Gamma$, and for all
$1 \le i, j \le n$ there exists $\sigma \in \Gamma$ such that $\sigma(i) = j$; here $S_n$ denotes the full permutation group on $n$
elements. One instructive example is the simple majority function when $n$ is odd, which is invariant
under $S_n$; another is the recursive majority function $RM_{k,\ell}$ which is defined for $n = k^\ell$ where $k$ is
odd. The definition is by induction. $RM_{k,1}$ is just the majority function on $k$ bits and

$$RM_{k,\ell+1}(x_1, \dots, x_{k^{\ell+1}}) = RM_{k,1}\left(RM_{k,\ell}(x_1, \dots, x_{k^\ell}), \dots, RM_{k,\ell}(x_{k^{\ell+1} - k^\ell + 1}, \dots, x_{k^{\ell+1}})\right).$$

See Figure 1.

Figure 1: The function $RM_{3,2}$
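The inductive definition of recursive majority translates directly into a short recursive routine. The following sketch is illustrative only; the names `majority` and `recursive_majority` are ours, and the input is assumed to be a sequence of 0/1 values whose length is a power of $k$.

```python
def majority(bits):
    """Simple majority of an odd number of 0/1 values."""
    return int(2 * sum(bits) > len(bits))


def recursive_majority(bits, k=3):
    """Majority of the k recursive majorities of the k consecutive blocks of bits."""
    n = len(bits)
    if n < 1:
        raise ValueError("at least one bit is required")
    if n == 1:
        return bits[0]
    block, remainder = divmod(n, k)
    if remainder:
        raise ValueError("the number of bits must be a power of k")
    return majority([recursive_majority(bits[j * block:(j + 1) * block], k) for j in range(k)])


if __name__ == "__main__":
    # Five of the nine voters vote 1, yet two of the three blocks have majority 0.
    x = (1, 0, 0, 1, 0, 0, 1, 1, 1)
    print(majority(x), recursive_majority(x))
```

The example vector shows that recursive majority on nine voters differs from simple majority: a strict majority of ones can still produce the outcome 0.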



Theorem 1.2.

(a) For every $p > 1/2$, $\epsilon > 0$ there is $\delta = \delta(p, \epsilon) > 0$ such that for every weighted majority function $f$
and any distribution $\mu$ on $\{0,1\}^n$, if $e_k^\mu[f] \le \delta$ and $\mu[X_k = 1] \ge p$ for all $k$ then $\mu[f] \ge 1 - \epsilon$.

In other words, if the effect of each variable is at most $\delta$ and the probability that each variable
is 1 is at least $p$, then $f = 1$ with $\mu$-probability at least $1 - \epsilon$.

(b) If $f$ is a monotone anti-symmetric function but not a weighted majority function, then there
exists a probability distribution $\mu$ such that $\mu[X_k = 1] > 1/2$ for all $k$, yet $\mu[f] = 0$ and
$e_k^\mu(f) = 0$ for all $k$.

In other words, if $f$ is not a weighted majority function, then there is a probability measure
$\mu$ for which $f = 0$ with $\mu$-probability 1, yet $\mu[X_k = 1] > 1/2$ for all $k$. (Since $f$ is constant
according to the measure $\mu$, all the effects are 0 in this case.)

(c) If $f$ is monotone anti-symmetric and invariant under a transitive group, but is not the (simple)
majority function, then there exists a probability distribution $\mu$ such that $\mu[X_k = 1] > 1/2$
for all $k$, yet $\mu[f] = 0$ and $e_k^\mu(f) = 0$ for all $k$.

The rest of this paper is organized as follows. In Section 2 we will discuss the notions of
aggregation of information, influences and effects for general probability distributions on $\{0,1\}^n$.
We will try to examine what aggregation of information means when we do not suppose that the
probability distribution for the voters' behavior is a product distribution. We also examine to what
extent our technical notion of "effects" represents real influence in the non-technical sense of the
word. Section 3 contains the proof of our theorem and in Section 4 we present several natural
problems as well as an example showing that Theorem 1.1 does not extend to arbitrary Boolean
monotone functions even for the restricted class of FKG-distributions. Finally, in Section 5 we
present an alternative proof of Theorem 1.2 (a) that yields sharper quantitative bounds.

2 Voting games, information aggregation and notions of influence 

Consider the following scenario. Every agent $k$ receives a single bit of information $s_k$ which is either
'Vote for Alice' or 'Vote for Bob' and these signals are independent. When Alice is the better
candidate the probability of receiving the signal 'Vote for Alice' is $p > 1/2$. Condorcet's Jury
Theorem deals with the case that the voters vote precisely as the signal dictates and the decision is
made according to the simple majority rule. It asserts that for every $p > 1/2$ the better candidate
will be elected with probability tending to one. Thus the majority rule allows one to reveal the actual
state of the world from rather weak individual signals.

A major problem in the economic and political interpretation of Condorcet's Jury Theorem and 
its extensions arises from the fact that the basic assumption of probability independence among 
voters is quite unrealistic. Without the assumption of independence, Condorcet's Jury Theorem as 
stated is no longer true, and it will no longer be the case that when each individual votes for Alice 
with probability p > 1/2, Alice will win with a high probability. 

To see this, consider the following example. As before, we have an election between Alice and
Bob and Alice is the superior candidate. The distribution of signals $s_1, s_2, \dots, s_n$ will be biased
towards Alice as follows: Let $p = 1/2 + \epsilon/2$, where $\epsilon$ is small. First choose at random a number
$t$ uniformly in the interval $[\epsilon, 1]$. Then, independently for each $i$, choose the $i$'th voter's signal $s_i$
to be '1' with probability $t$ and '0' with probability $1 - t$. Voters with $s_i = 1$ will vote for Alice.
In this case, the probability for each individual signal $s_i$ being '1' is $p$ but the individual signals
are not independent. The probability that Alice will win is below $\frac{1}{2(1-\epsilon)}$ for any number of voters.
This is because we can think of $t$ as being chosen in two stages. First we toss a coin which is 'H'
with probability $\epsilon/(1 - \epsilon)$. If the coin is 'H', then $t$ is chosen uniformly in the interval $[1 - \epsilon, 1]$.
This contributes at most $\epsilon/(1 - \epsilon)$ to the probability that Alice wins. If the coin is 'T' then $t$ is
chosen uniformly in the interval $[\epsilon, 1 - \epsilon]$. Here by symmetry, Alice and Bob have the same chance
of winning. Thus the contribution to the probability that Alice will win from this case is $\frac{1-2\epsilon}{2(1-\epsilon)}$.

Thus the overall probability that Alice will win is at most $\frac{1-2\epsilon}{2(1-\epsilon)} + \frac{\epsilon}{1-\epsilon} = \frac{1}{2(1-\epsilon)}$.
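The bound in this example can be checked numerically by averaging the binomial winning probability over the random bias $t$. This is a rough sketch, assuming an odd number of voters and approximating the average over $t$ by a midpoint rule; the function names are ours.

```python
from math import comb


def win_given_t(n, t):
    """P[simple majority for Alice] when the n votes are independent with bias t."""
    return sum(comb(n, k) * t**k * (1 - t) ** (n - k) for k in range(n // 2 + 1, n + 1))


def alice_win_probability(n, eps, steps=2000):
    """Midpoint-rule average of win_given_t over t uniform in [eps, 1]."""
    h = (1 - eps) / steps
    return sum(win_given_t(n, eps + (j + 0.5) * h) for j in range(steps)) * h / (1 - eps)


if __name__ == "__main__":
    eps = 0.1
    print("upper bound:", 1 / (2 * (1 - eps)))
    for n in (1, 11, 101):
        print(n, alice_win_probability(n, eps))
```

However many voters are used, the estimate stays below $1/(2(1-\epsilon))$, so adding voters does not push the winning probability towards one.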

An even more extreme example is the case in which all voters vote in the same way: With 
probability p they all vote for Alice and with probability 1 — p they all vote for Bob. Alice will be 
elected with probability p regardless of the number of voters when the election is based on simple 
majority and for every other simple game. 

These simple examples will help us to examine the notions of information aggregation and 
influence in the case when the assumption of probability independence is dropped. The problem 
in these examples is not in the way information aggregates but in the quality of the information to
start with. This assertion can be formalized as follows. Suppose that Alice and Bob are given an
a-priori probability 1/2 of being the superior candidate. We assume that the distribution of voters
for Bob given Bob is the superior candidate is the same as the distribution of voters for Alice given
that Alice is the superior candidate. Thus in the first example above if Bob is superior then we
choose $t$ uniformly in $[\epsilon, 1]$ and then each voter votes for Bob independently with probability $t$. In
the second example if Bob is superior, then all voters will vote for Bob with probability $p$.

We now wish to decide between the hypothesis that Alice is the superior and the hypothesis 
that Bob is the superior candidate given the entire vector of individual signals. It is intuitively 
clear and easy to prove using the Neyman-Pearson Lemma that in both cases described above one 
should guess that Alice is superior to Bob exactly when the majority of voters voted for Alice. 
However, in both examples above the probability that the majority will vote Alice when Alice is 
superior is bounded away from 1 and tends to 1/2 as p does. 

When we consider general distributions, the issue is to understand what information we can
derive on the superior alternative from knowing the signals of all individuals and how the voting
mechanism extracts this information. Note that in the examples we considered above the individual
effects are large while the individual influences are small. This is most transparent in the second
example where if $f$ is the majority function and $n \ge 3$, then all of the influences are 0, while the
effects of all voters are 1. Theorem 1.2 asserts that for weighted majority voting rules (and only for
these rules), for every probability measure on $\{0,1\}^n$, small individual effects imply asymptotically
complete aggregation of information.
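The second example, in which all voters vote identically, is small enough to compute by hand, and the following sketch does the same computation mechanically. As before, a measure is represented as a dictionary from 0/1 tuples to probabilities; this representation and the function names are our own illustrative choices.

```python
def mean(mu, g):
    """Expectation of g under the measure mu."""
    return sum(weight * g(x) for x, weight in mu.items())


def effect(mu, f, k):
    """mu[f | X_k = 1] - mu[f | X_k = 0]."""
    p = mean(mu, lambda x: x[k])
    return mean(mu, lambda x: f(x) * x[k]) / p - mean(mu, lambda x: f(x) * (1 - x[k])) / (1 - p)


def influence(mu, f, k):
    """Probability under mu that voter k is pivotal."""
    return mean(mu, lambda x: int(f(x[:k] + (0,) + x[k + 1:]) != f(x[:k] + (1,) + x[k + 1:])))


def majority(x):
    return int(2 * sum(x) > len(x))


def unanimous_measure(n, p):
    """All n voters vote 1 with probability p and all vote 0 with probability 1 - p."""
    return {(1,) * n: p, (0,) * n: 1 - p}


if __name__ == "__main__":
    mu = unanimous_measure(5, 0.6)
    for k in range(5):
        print(k, influence(mu, majority, k), effect(mu, majority, k))
    print("mu[f] =", mean(mu, majority))
```

Every voter has influence 0 and effect 1, and the probability that Alice is elected equals $p$ regardless of the number of voters.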

Let us next consider the notion of influence without probability independence. The notion of
pivotal variables (or players) and influence is of great technical importance in various areas
of mathematics, computer science and economics. This notion is also of considerable conceptual
importance. The voting power index of Banzhaf is based on measuring the influence with respect
to the uniform distribution. The Shapley-Shubik power index can also be based on the influence
with respect to another distribution. Conceptual understanding of voting power in situations where
the voters' behavior is not independent is of great interest. In [3] the authors propose to define the
voting power as the probability of being pivotal based on realistic assumptions on individual voting
distributions. We make the following remarks on the notion of individual effects, which is quite a
different extension of influence and voting power measures to arbitrary probability distributions.

(i) For general distributions, the effect of an agent can be negative. This will be the case for a
voter who always votes for the candidate who is the underdog in the election polls and also for
a committee member who antagonizes the other members of the committee. (On the other
hand, the influence of an agent is always non-negative, because it is defined as a
probability.)

(ii) A dummy (a voter $k$ which is never pivotal) has zero influence (with respect to every probability
distribution). He may nevertheless have a large effect, for instance if he always votes for the
candidate who is expected to win according to election polls. In real life, this will also be the
case for an observer on a committee without the right to vote but who is likely to convince
the committee of his opinion. Note that in the first case we do not attribute to that player
real "influence" in the (non-technical) English sense of the word, while in the second case we
would consider him "influential". The uncertainty in interpreting effects as real "influences"
is genuine.

(iii) What is the motivation for a voter to vote, given the small probability for him to be pivotal? 
This is a social dilemma, related to, e.g., the so-called tragedy of the commons, and has been 
extensively discussed in the political science and philosophy literature. (Sometimes the term 
voting paradox has been used for this dilemma, but may cause some confusion as the same 
term is used also for Condorcet's famous observation that when three or more choices are 
available, the majority preference between them need not be transitive.) A possible solution 
to the dilemma may lie in the fact that in real-life elections, individual effects tend to be 
large, namely bounded away from zero regardless of the size of the society. The uncertainty 
in regarding effects as real "influence" may suggest that it is the effect of an agent rather 
than his influence which is related to his "satisfaction" with the social decision process and 
his ability to identify with the collective choice. 

3 Proof of Theorem 1.2

3.1 Part (a) of the theorem 

We begin this section by providing a probabilistic proof of the following result, which clearly implies
Theorem 1.2 (a).

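Lemma 3.1. Let $(w_i)_{i=1}^n$ be non-negative weights which are not all 0, let $0 < q < 1$, and let
$f : \{0,1\}^n \to \{0,1\}$ be a function which satisfies

$$f(x_1, \dots, x_n) = \begin{cases} 1 & \text{if } \sum_{i=1}^n (2x_i - 2q) w_i > 0, \\ 0 & \text{if } \sum_{i=1}^n (2x_i - 2q) w_i < 0. \end{cases}$$

Write $W = \sum_{i=1}^n w_i$. Suppose furthermore that $p > q$ and that $\mu$ is a probability measure satisfying
$\mu[X_i] = p_i$ and

$$\sum_{i=1}^n w_i p_i = pW \tag{4}$$

as well as

$$\sum_{i=1}^n w_i p_i (1 - p_i)\, e_i^\mu[f] \le p(1 - p)\delta W. \tag{5}$$

Then

$$\mu[f] \ge 1 - \frac{\delta p(1 - p)}{p - q}.$$

(Note that (4) holds if $\mu[X_i] = p$ for all $i$, and that (5) holds if $\mu[X_i] = p$ and $e_i^\mu[f] \le \delta$ for all
$i$, so that indeed Theorem 1.2 (a) follows.)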

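Proof. Let $X = \sum_{i=1}^n (2X_i - 2q) w_i$. We start by noting that $\mu[X] = (2p - 2q)W$.
We let $g = 1 - f$ and $Y_i = p_i - X_i$, so that

$$\mu[Y_i g] = \mathrm{Cov}_\mu[f, X_i] = p_i(1 - p_i)\, e_i^\mu[f].$$

Note that conditioned on $g = 1$, $\sum_{i=1}^n (2X_i - 2q) w_i \le 0$ and therefore $\sum_{i=1}^n w_i Y_i \ge (p - q)W$. It
follows that

$$\mu\left[\sum_{i=1}^n w_i Y_i\, g(X_1, \dots, X_n)\right] \ge (p - q) W \mu[g] = (p - q) W (1 - \mu[f]). \tag{6}$$

On the other hand,

$$\mu\left[\sum_{i=1}^n w_i Y_i\, g(X_1, \dots, X_n)\right] = \sum_{i=1}^n w_i\, \mu[Y_i\, g(X_1, \dots, X_n)] = \sum_{i=1}^n w_i p_i (1 - p_i)\, e_i^\mu[f] \le p(1 - p)\delta W. \tag{7}$$

Combining (6) and (7), we get that

$$\mu[f] \ge 1 - \frac{\sum_{i=1}^n w_i p_i (1 - p_i)\, e_i^\mu[f]}{(p - q) W} \ge 1 - \frac{\delta p(1 - p)}{p - q}.$$

□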
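The chain of inequalities in the proof can be sanity-checked on random instances. The sketch below draws random positive weights and a random measure on $\{0,1\}^5$ tilted towards vectors with many ones (so that $p > q$), and compares $1 - \mu[f]$ with the quantity $\sum_i w_i \mathrm{Cov}_\mu[f, X_i] / ((p - q)W)$ obtained by combining (6) and (7). The sampling scheme and the function name `lemma_3_1_check` are our own choices for illustration.

```python
import random
from itertools import product


def lemma_3_1_check(n=5, q=0.5, seed=0):
    """Return (1 - mu[f], bound) for random weights and a random measure mu.

    The bound is sum_i w_i Cov(f, X_i) / ((p - q) W), the middle term in the
    final display of the proof of Lemma 3.1.
    """
    rng = random.Random(seed)
    w = [rng.uniform(0.5, 2.0) for _ in range(n)]
    W = sum(w)

    def f(x):
        return int(sum((2 * xi - 2 * q) * wi for xi, wi in zip(x, w)) > 0)

    points = list(product((0, 1), repeat=n))
    # Random masses tilted towards vectors with many ones, so that p > q.
    raw = [rng.random() * 3.0 ** sum(x) for x in points]
    total = sum(raw)
    mu = {x: r / total for x, r in zip(points, raw)}

    p_i = [sum(m * x[i] for x, m in mu.items()) for i in range(n)]
    p = sum(wi * pi for wi, pi in zip(w, p_i)) / W  # this is condition (4)
    if p <= q:
        raise ValueError("the lemma requires p > q")
    mu_f = sum(m * f(x) for x, m in mu.items())
    # Cov(f, X_i) equals p_i (1 - p_i) e_i[f], the summand in (5).
    weighted_cov = sum(
        wi * sum(m * f(x) * (x[i] - p_i[i]) for x, m in mu.items())
        for i, wi in enumerate(w)
    )
    return 1 - mu_f, weighted_cov / ((p - q) * W)


if __name__ == "__main__":
    for seed in range(5):
        print(lemma_3_1_check(seed=seed))
```

In each printed pair the first number should not exceed the second, up to floating-point rounding.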



3.2 Parts (b) and (c) of the theorem 

We note that part (c) of Theorem 1.2 follows immediately from part (b), because the only weighted
majority function that is invariant under a transitive group is simple majority. Let us nevertheless
begin by giving an independent and simple proof of part (c). Note that if $f$ is not the majority
function then there is a vector $(x_1, x_2, \dots, x_n) \in \{0,1\}^n$ such that $f(x) = 0$ and $x_1 + x_2 + \dots + x_n > n/2$.
Then we can simply take $\mu$ to be the uniform probability distribution on the orbit of $x$ under $\Gamma$.
It is then easy to see that $\mu[X_k] > 1/2$ for all $k$ and that $\mu[f = 0] = 1$.

We now turn to the proof of Theorem 1.2 (b). We will show that if $f$ is not a weighted majority
function, then there exists a measure $\mu$ satisfying $\mu[X_k] > 1/2$ for all $k$ and $\mu[f = 0] = 1$.

Define $[n] = \{1, 2, \dots, n\}$. For $S \subset [n]$ put $x_S = (x_1, x_2, \dots, x_n)$ where $x_i = 1$ if and only if
$i \in S$. Let $H$ be a hypergraph whose set of vertices is $[n]$ and whose edges are subsets $S$ of $[n]$
such that $f(x_S) = 0$. Let $\tau^* = \tau^*(H)$ be the fractional cover number of $H$, i.e., the infimum over
all $\nu : \{0,1\}^n \to \mathbb{R}$ of $\sum_{S \in H} \nu[x_S]$ under the condition that $\nu(x_S) \ge 0$ for every $S \in H$ and
$\sum_{S \in H, k \in S} \nu[x_S] \ge 1$ for all $k$. We get $\tau^* = \infty$ if there is no $\nu$ satisfying the two conditions above
(note that this is the case if $f(x) = x_1$, say).

If $\tau^* < 2$, then we can define $\mu(S) = 0$ if $f(S) = 1$ and $\mu(S) = \nu(S)/\tau^*$ when $f(S) = 0$. The
probability measure $\mu$ satisfies

$$\sum_{S : k \in S, f(S) = 0} \mu(x_S) \ge 1/\tau^* > 1/2$$

for every $k$ and $\mu[f = 0] = 1$ as stated in the theorem. Therefore, in order to prove part (b) of the
theorem, it only remains to analyze the case $\tau^* \ge 2$.

A well known equivalent (by linear programming duality) definition of $\tau^*$ is as the supremum
of $\sum_{i=1}^n w_i$ under the condition that $w_k \ge 0$ for $k = 1, 2, \dots, n$ and $\sum \{w_i : i \in S\} \le 1$ for every
$S \in H$.

Assume first that $\tau^* > 2$. In this case we can find $w_i$'s such that $\sum_i w_i > 2$ and $f(x_1, \dots, x_n) = 1$
if $\sum_i w_i x_i > 1$. By slightly perturbing the $w_i$ we may assume that for all $x \in \{0,1\}^n$ it holds that
$\sum_i w_i x_i \ne \frac{1}{2} \sum_i w_i$, in addition to the properties that $\sum_i w_i > 2$ and $f(x_1, \dots, x_n) = 1$ if $\sum_i w_i x_i > 1$.
Let $g(x) = 1$ if $\sum_i w_i x_i > \frac{1}{2} \sum_i w_i$ and $g(x) = 0$ if $\sum_i w_i x_i < \frac{1}{2} \sum_i w_i$. Then $g$ is anti-symmetric
and $f = 0 \Rightarrow g = 0$. It follows that $f = g$ so that $f$ is a weighted majority function as needed.

The remaining case is where $\tau^* = \sum_i w_i = 2$. We obtain that $f(x_1, \dots, x_n) = 1$ if $\sum_i w_i x_i > 1$.
Since $f$ is anti-symmetric it follows that $f(x_1, \dots, x_n) = 0$ if $\sum_i w_i x_i < 1$. The result follows. □

4 Problems and an additional example 

The following problems naturally suggest themselves at this point: 

(1) For which class of distributions is it the case that for simple majority small voting power 

implies asymptotically complete aggregation of information? 

(2) For which class of distributions is it the case that for every monotone Boolean function small 

voting power implies asymptotically complete aggregation of information? 

(3) For which class of distributions is it the case that for every monotone Boolean function small 

individual effects implies asymptotically complete aggregation of information? 

A natural condition to impose on the distribution $\mu$ which is realistic in various economic
situations is the FKG condition (see [?]). For $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$, define

$$\max(x, y) = (\max(x_1, y_1), \dots, \max(x_n, y_n))$$

and

$$\min(x, y) = (\min(x_1, y_1), \dots, \min(x_n, y_n)).$$

One definition of an FKG measure on $\{0,1\}^n$ goes as follows: A distribution $\mu$ on $\{0,1\}^n$ (or on $\mathbb{R}^n$)
is called an FKG measure if for every $x, y \in \{0,1\}^n$ we have

$$\mu(x)\mu(y) \le \mu(\max(x, y))\,\mu(\min(x, y)).$$

The FKG property is a profound notion of non-negative correlations between agents' signals. It
implies (but is strictly stronger than) the following condition (known as non-negative association,
see [7]): For all increasing real functions $f$ and $g$, it is the case that $E[fg] \ge E[f]E[g]$. This is
equivalent to the condition that for all increasing events $A$ and $B$ we have that $P[A \cap B] \ge P[A]P[B]$.
Under the FKG property, if the simple game is monotone, all effects are non-negative. This form of
non-negative correlation is a plausible assumption to make in various contexts of collective choice.
It is easy to see that under the condition of non-negative association all individual effects are
non-negative.
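For small $n$ the FKG lattice condition can be tested directly by checking all pairs of points. This is a brute-force sketch with our own naming; the measure is a dictionary from 0/1 tuples to probabilities, points missing from the dictionary are treated as having mass zero, and a small tolerance absorbs floating-point rounding.

```python
from itertools import product


def is_fkg(mu, n, tol=1e-12):
    """Check mu(x) mu(y) <= mu(max(x, y)) mu(min(x, y)) for all x, y in {0,1}^n."""
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            hi = tuple(max(a, b) for a, b in zip(x, y))
            lo = tuple(min(a, b) for a, b in zip(x, y))
            if mu.get(x, 0.0) * mu.get(y, 0.0) > mu.get(hi, 0.0) * mu.get(lo, 0.0) + tol:
                return False
    return True


if __name__ == "__main__":
    p = 0.6
    independent = {x: p ** sum(x) * (1 - p) ** (3 - sum(x)) for x in product((0, 1), repeat=3)}
    unanimous = {(1, 1, 1): p, (0, 0, 0): 1 - p}
    opposed = {(0, 1): 0.5, (1, 0): 0.5}  # two voters who always disagree
    print(is_fkg(independent, 3), is_fkg(unanimous, 3), is_fkg(opposed, 2))
```

Independent voters and perfectly aligned voters satisfy the condition, while two voters who always disagree violate it.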



(4) For which class of monotone Boolean functions does small individual effects imply asymptoti- 
cally complete aggregation of information? 

In the following subsection, we present an example of an FKG measure and a monotone Boolean
function such that the individual effects are small and yet there is no asymptotically complete
aggregation of information. In this example both the voting scheme and the measure $\mu$ are invariant
under a transitive group of permutations.

4.1 Example: FKG without aggregation 

The measure $\mu$. We start by describing the measure $\mu$. The measure is given by a Gibbs measure
for the Ising model on the 3-regular tree. See e.g. [?]. The measure is defined as follows. Let
$T_r = (V_r, E_r)$ be the $r$-level 3-regular tree. This is a rooted tree where each internal node has
exactly 3 children and all the leaves are at distance exactly $r$ from the root $\rho$. Let $L_r$ be the set of
leaves of that tree. Note that $|L_r| = 3^r$. Thus in Figure 1 the underlying tree is $T_2$.

We first define a measure $\nu$ on the tree, that is, on $\{0,1\}^{V_r}$. In this measure the probability of $x$ is given by

$$\nu[x] = \frac{1}{2} \prod_{(u,v) \in E_r} \left((1 - \epsilon)\mathbf{1}_{\{x_u = x_v\}} + \epsilon \mathbf{1}_{\{x_u \ne x_v\}}\right).$$

In words, this means that in the measure $\nu$ the sign of the root $x_\rho$ is chosen to be 0 or 1 with
probability 1/2. Then each vertex inherits its parent's label with probability $\theta = 1 - 2\epsilon$ and is
chosen independently otherwise.

Our measure $\mu$ is defined on $\{0,1\}^{L_r}$ (so that the voters are the leaves of the tree) as follows.

$$\mu[x] = \sum_{y : y|_{L_r} \le x} \nu[y]\, \delta^{|\{i : x_i = 1, y_i = 0\}|} (1 - \delta)^{|\{i : x_i = 0\}|}.$$

In other words, a configuration of votes according to $\mu$ may be obtained by drawing a configuration
$y$ according to $\nu$ and looking at $y|_{L_r}$. Then for each of the coordinates $i \in L_r$ independently,
the vote at $i$ is re-sampled to have the value 1 with probability $\delta$. Below we will sometimes abuse
notation and write $\mu$ for the joint probability distribution of $x$ and $y$.

Standard results for the Ising model (see, e.g., [2]) imply that $\mu$ is an FKG measure. Moreover
it is easy to see that the measure is invariant under a transitive group and that $\mu[x_i] = (1 + \delta)/2$
for all $i$.

The function $m$. The function $m$ is given by the recursive majority function $m = RM_{3,r}$.
Clearly, $m$ is monotone, anti-symmetric and invariant under a transitive group.

Claim 4.1. If $\epsilon = \delta < 0.01$ then $\mu[m] \le 1/2 + \delta/2$ for $m = RM_{3,r}$ and all $r$.

Proof. The proof below is similar to arguments in [?]. Let $(y_v : v \in V_r)$ be chosen according to
the measure $\nu$. Let $(x_v : v \in L_r)$ be obtained from $y$ by re-sampling each of the coordinates of
$(y_v : v \in L_r)$ to 1 with probability $\delta$. Let $(m_v : v \in V_r)$ denote the value of the recursive majority
of all $(x_w : w \in L_r(v))$, where $L_r(v)$ are all the leaves of $T_r$ below $v$. We will show that
$\mu[m = m_\rho = 0 \mid y_\rho = 0] \ge 1 - \delta$. Since $\mu[y_\rho = 0] = 1/2$, we conclude that
$\mu[m] \le 1/2 + \mu[m \mid y_\rho = 0]/2 \le (1 + \delta)/2$, as needed.

We are interested in the probability that $m_v = 0$ conditioned on $y_v = 0$. It is easy to see that
this probability only depends on the height of $v$, i.e., on the distance between $v$ and the set of
leaves. We let $p(k)$ denote the probability that $m_v = 0$ conditioned on $y_v = 0$ for a vertex $v$ of
height $k$.

Clearly, $p(0) = 1 - \delta$. We want to prove by induction that $p(k) \ge 1 - \delta$ for all $k$. Let $v$ be a node
of height $k + 1$ and $w$ a child of $v$. Note that conditioned on $y_v = 0$ the probability that $m_w = 0$
is at least $(1 - \epsilon)p(k)$, which is at least $t = (1 - \epsilon)(1 - \delta)$ by the induction hypothesis. Moreover,
noting that the values of the majorities of the children of the node $v$ are conditionally independent
given that $y_v = 0$, we conclude that

$$p(k + 1) \ge t^3 + 3t^2(1 - t) = 3t^2 - 2t^3 = t^2(3 - 2t).$$

We need that $t^2(3 - 2t) \ge 1 - \delta$ or, recalling that $\epsilon = \delta$: $(1 - \epsilon)^4(3 - 2(1 - \epsilon)^2) \ge (1 - \epsilon)$. This in
turn is equivalent to $(1 - \epsilon)^3(3 - 2(1 - \epsilon)^2) \ge 1$. The function $h(\epsilon) = (1 - \epsilon)^3(3 - 2(1 - \epsilon)^2)$ has
$h'(\epsilon) = 10(1 - \epsilon)^4 - 9(1 - \epsilon)^2 = (1 - \epsilon)^2(10(1 - \epsilon)^2 - 9)$. Therefore $h$ is increasing in the interval
$[0, 0.01]$. Since $h(0) = 1$ it follows that $h(\epsilon) \ge 1$ for all $\epsilon < 0.01$ as needed. □
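The induction in the proof can be followed numerically. The sketch below evaluates the function $h$ on a grid in $[0, 0.01]$ and iterates the recursion with equality, starting from $1 - \delta$; since $t \mapsto t^2(3 - 2t)$ is increasing on $[0,1]$, the iterates are lower bounds for $p(k)$. The function names and the grid are our own choices.

```python
def h(eps):
    """The function h(eps) = (1 - eps)^3 (3 - 2 (1 - eps)^2) from the proof."""
    return (1 - eps) ** 3 * (3 - 2 * (1 - eps) ** 2)


def lower_bounds(eps, delta, levels):
    """Iterate b_{k+1} = t^2 (3 - 2t) with t = (1 - eps) b_k, starting at b_0 = 1 - delta."""
    b = 1 - delta
    bounds = [b]
    for _ in range(levels):
        t = (1 - eps) * b
        b = t * t * (3 - 2 * t)
        bounds.append(b)
    return bounds


if __name__ == "__main__":
    print(min(h(j * 1e-4) for j in range(101)))   # smallest value of h on the grid
    print(min(lower_bounds(0.005, 0.005, 50)))    # smallest lower bound over 50 levels
```

With $\epsilon = \delta = 0.005$ the iterates never drop below $1 - \delta$, which is the content of the induction step.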

Our next objective is to bound the effect of a voter at level r. We will prove the following: 

Claim 4.2. The measure $\mu$ on $T_r$ and the function $m = RM_{3,r}$ satisfy that the effect of each voter
is at most $(1 - \epsilon/2)^{(r-1)/2} + 2^{-(r-1)/2}$.

Proof. The argument here is similar in spirit to an argument in [?]. Let $t + s = r$ where $t \ge (r - 1)/2$
and $s \ge (r - 1)/2$. Fix a leaf voter $i$. We want to estimate $\mu[m = 1 \mid y_i = 1] - \mu[m = 1 \mid y_i = 0]$. Let us
denote by $\mu_0$ the measure $\mu$ conditioned on $y_i = 0$ and by $\mu_1$ the measure $\mu$ conditioned on $y_i = 1$.

Let $i = v_0, v_1, \dots, v_r = \rho$ denote the path from $i$ to the root. We first claim that the measures
$\mu_0$, $\mu_1$ and $\mu$ may be coupled in such a way that except with probability $(1 - 2\epsilon)^t$ the only
disagreements between $\mu_0$, $\mu_1$ and $\mu$ are on vertices below $v_t$.

This follows immediately from the random cluster representation of the model. In this representation
we declare an edge $(u, v)$ open with probability $(1 - 2\epsilon)$ and closed with probability $2\epsilon$. If
the edge $(u, v)$ is open then $y_u = y_v$; otherwise, the two labels are independent. It is then clear that
we may couple the measures $\mu_0$, $\mu_1$ and $\mu$ below $v_t$ as long as the path from $i$ to $v_t$ contains at
least one closed edge. The probability that such an edge does not exist is at most $(1 - 2\epsilon)^t$. The
proof of the first claim follows.

For each $j$ denote by $u_j$ and $w_j$ the siblings of $v_j$. We assume that the measures $\mu_0$, $\mu_1$ and
$\mu$ are coupled in such a way that the only disagreements between them are on vertices below $v_t$.
Note that if this is the case, then if the values of $m$ under $\mu_0$ and $\mu_1$ are different then for all
$r > j \ge t$ it holds that $m_{u_j} \ne m_{w_j}$. We wish to bound the $\mu$ probability that $m_{u_j} \ne m_{w_j}$ for
$r > j \ge t$. We will bound this probability conditioned on the values $(y_{v_j})_{j=t}^r$. Conditioned on
$(y_{v_j})_{j=t}^r$ the events $m_{u_j} \ne m_{w_j}$ are independent for different $j$'s. Moreover, by the Markov property,
$\mu[m_{u_j} \ne m_{w_j} \mid (y_{v_h})_{h=t}^r] = \mu[m_{u_j} \ne m_{w_j} \mid y_{v_{j+1}}]$. Finally note that conditioned on $y_{v_{j+1}}$ the random
variables $m_{u_j}, m_{w_j}$ are identically distributed and independent. Therefore

$$\mu[m_{u_j} \ne m_{w_j} \mid y_{v_{j+1}}] \le \max_{p \in [0,1]} 2p(1 - p) \le 1/2.$$

We thus obtain that the $\mu$ probability that $m_{u_j} \ne m_{w_j}$ for $r > j \ge t$ is at most $2^{-s}$. □

5 A linear programming bound 

We present in this section an alternative proof of Theorem 1.2 (a) using linear programming. This
approach yields tight bounds, stated in the following lemma.

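Lemma 5.1. Let $(w_i)_{i=1}^{n}$ be positive weights, let $0 < q < 1$, and let $f : \{0,1\}^n \to \{0,1\}$ be a function satisfying

$$f(x) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} (2x_i - 2q) w_i > 0, \\ 0 & \text{if } \sum_{i=1}^{n} (2x_i - 2q) w_i < 0. \end{cases}$$

Write $W = \sum_{i=1}^{n} w_i$. Suppose that $p > q$ and that $\mu$ is a probability measure satisfying

$$\sum_{i=1}^{n} w_i \mu[X_i] = pW \tag{8}$$

and

$$\sum_{i=1}^{n} w_i p_i (1 - p_i) e_i^{\mu}[f] \le \delta W p (1 - p). \tag{9}$$

If $\delta \ge \frac{p-q}{p(1-q)}$, then

$$\mu[f] \ge \frac{p-q}{1-q},$$

whereas otherwise

$$\mu[f] \ge \max\left\{ \delta p,\ 1 - \frac{\delta p (1-p)}{p-q} \right\}.$$

These bounds are tight.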

(Note that the conditions of Theorem 1.2 (a) imply (8) and (9).)

Proof. We will first make the necessary computations for the case of simple majority. Let $f : \{0,1\}^n \to [0,1]$ be a symmetric monotone function. Let $\delta_i(x) = 1$ if $x_i = 1$ and $\delta_i(x) = 0$ otherwise. Let $\eta_i(x) = 1 - p$ if $x_i = 1$ and $\eta_i(x) = -p$ if $x_i = 0$.

We want to minimize

$$\mu[f(X_1, \ldots, X_n)] = \sum_{x} \mu(x) f(x) \tag{10}$$

under the restrictions that

$$\sum_{i,x} \delta_i(x) \mu(x) = \sum_{i=1}^{n} \mu[X_i] = np, \tag{11}$$

and (letting $Y_i = X_i - p$)

$$\sum_{i=1}^{n} \mu[Y_i f(X_1, \ldots, X_n)] = \sum_{i=1}^{n} \mathrm{Cov}_{\mu}[f(X_1, \ldots, X_n), X_i] \le \delta p (1-p) n,$$

which gives

$$\sum_{x,i} \eta_i(x) \mu(x) f(x) \le n \delta p (1-p). \tag{12}$$

The constraints (11) and (12) and the cost (10) are invariant under the action of $S_n$ on the coordinates of $x$ (since $f(x)$ has this invariance property). It follows that there exists a minimizer $\mu$ which is symmetric, i.e.,

$$\mu(x) = \frac{\alpha_{|x|}}{\binom{n}{|x|}},$$

where $\alpha$ is a positive function.

Since $f$ is a majority function, it follows that there exists an $r$ such that $f(x) = f(|x|) = 1$ if and only if $|x| > r$. We therefore obtain the following optimization problem. Write $q = r/n$ and $q' = (r+1)/n$. We assume below that $p > q'$.



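We want to minimize

$$\sum_{i=r+1}^{n} \alpha_i \tag{13}$$

under the restrictions

$$\sum_{i=0}^{n} \frac{i}{n} \alpha_i = p, \tag{14}$$

and

$$\sum_{i=r+1}^{n} \frac{i}{n} \alpha_i - p \sum_{i=r+1}^{n} \alpha_i \le \delta p (1-p). \tag{15}$$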

It is useful to introduce $A = \sum_{i=r+1}^{n} \alpha_i$ and $B = \sum_{i=r+1}^{n} \frac{i}{n} \alpha_i$. Similarly, we write $A' = \sum_{i=0}^{r} \alpha_i$ and $B' = \sum_{i=0}^{r} \frac{i}{n} \alpha_i$. Note that $A, A', B, B'$ are all positive, $0 \le B' \le qA'$, and similarly, $q'A \le B \le A$. Note that $A + A' = 1$, (11) may be written as $B + B' = p$ and (12) may be written as $B - pA \le \delta p (1-p)$.

Moreover, it is easy to see that any $A, A', B, B'$ which satisfy the above equations give rise to $\alpha_i$ satisfying the constraints. Using $A + A' = 1$ and $B + B' = p$ we are thus led to the following optimization problem in $A$ and $B$: Minimize $A$ under the constraints

$$0 \le A \le 1, \tag{16}$$

$$q'A \le B \le A, \tag{17}$$

$$0 \le p - B \le q(1 - A), \tag{18}$$

$$B - pA \le \delta p (1-p). \tag{19}$$

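In other words, we are looking for the minimal $A$ satisfying

$$\max\left\{ B,\ \frac{B}{p} - \delta(1-p) \right\} \le A \le \min\left\{ \frac{B}{q'},\ 1 - \frac{p}{q} + \frac{B}{q} \right\}, \qquad 0 \le B \le p.$$

From the assumption $p > q'$ it follows that for $0 \le B \le p$, the minimum on the right hand side is obtained by $1 - \frac{p}{q} + \frac{B}{q}$. We may thus simplify:

$$\max\left\{ B,\ \frac{B}{p} - \delta(1-p) \right\} \le A \le 1 - \frac{p}{q} + \frac{B}{q}, \qquad 0 \le B \le p.$$

The two functions bounding $A$ from below are increasing in $B$. Therefore in order to minimize $A$ we should minimize $B$. Note that $B \ge \frac{B}{p} - \delta(1-p)$ if and only if $B \le B_c = \delta p$.

Suppose that $B \le B_c$. Then we obtain that $B \le 1 - \frac{p}{q} + \frac{B}{q}$, or equivalently, $B \ge \frac{p-q}{1-q}$. We thus conclude that if $\frac{p-q}{1-q} \le \delta p$, then the minimum for $B$ (and therefore for $A$) is obtained at $\frac{p-q}{1-q}$.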

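If $\frac{p-q}{1-q} > \delta p$, then $B > B_c$. Moreover, in this case, $\frac{B}{p} - \delta(1-p) \le 1 - \frac{p}{q} + \frac{B}{q}$, which is equivalent to

$$B \ge p - \frac{\delta p q (1-p)}{p-q}.$$

Therefore the minimal $B$ in this case is given by

$$B = \max\left\{ \delta p,\ p - \frac{\delta p q (1-p)}{p-q} \right\}$$

and therefore by the bound $A \ge \frac{B}{p} - \delta(1-p)$ the minimal $A$ is given by

$$A = \max\left\{ \delta p,\ 1 - \frac{\delta p (1-p)}{p-q} \right\}.$$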
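
The closed form can be checked numerically. The sketch below is an illustration only, not part of the original proof: it evaluates the bound of Lemma 5.1 and compares it with a brute-force grid search over $B$ for the smallest $A$ allowed by constraints (16) to (19). The function names, the grid resolution and the sample values $p = 0.7$, $q = 0.5$, $q' = 0.55$ (for instance $n = 20$, $r = 10$) are assumptions made for the example.

```python
def lemma_bound(p, q, delta):
    """Closed-form lower bound on mu[f] (assumes 0 < q < p < 1)."""
    threshold = (p - q) / (1 - q)
    if delta * p >= threshold:
        return threshold
    return max(delta * p, 1 - delta * p * (1 - p) / (p - q))


def grid_min_A(p, q, q_prime, delta, steps=100_000):
    """Smallest feasible A over a grid of B values in [0, p].

    For fixed B the constraints (16)-(19) confine A to an interval
    [lower, upper]; the minimal A for that B is `lower`.
    Returns None if no grid point is feasible.
    """
    best = None
    for k in range(steps + 1):
        B = p * k / steps
        # A >= 0 (16), A >= B (17), A >= B/p - delta*(1-p) (19)
        lower = max(0.0, B, B / p - delta * (1 - p))
        # A <= 1 (16), A <= B/q' (17), A <= 1 - p/q + B/q (18)
        upper = min(1.0, B / q_prime, 1 - p / q + B / q)
        if lower <= upper and (best is None or lower < best):
            best = lower
    return best


if __name__ == "__main__":
    for delta in (0.1, 0.8):
        print(delta, lemma_bound(0.7, 0.5, delta),
              grid_min_A(0.7, 0.5, 0.55, delta))
```

With these sample values, $\delta = 0.1$ falls in the second case of the lemma and $\delta = 0.8$ in the first; the grid search agrees with the closed form up to the grid resolution.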

This establishes the lemma for the special case of simple majority. Moving from simple majority to weighted majority is easy. First note that we can assume that all weights are nonnegative integers. Replace a variable $X_k$ with $w_k$ copies. We will thus consider $\{0,1\}^W$ where $W = w_1 + w_2 + \cdots + w_n$. Consider the distribution $\mu'$ on $\{0,1\}^W$ induced from $\mu$ with the requirement that with probability 1 all copies of an original variable have the same value. The desired result for the weighted majority on $\{0,1\}^n$ follows from the case of simple majority on $\{0,1\}^W$. $\Box$
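
The replication step can be illustrated with a small Python sketch, under the assumption of integer weights as in the proof. It checks, for every input $x$, that the weighted rule applied to $x$ agrees with the unweighted rule applied to the vector in which coordinate $k$ is repeated $w_k$ times. The helper names are hypothetical, and the sketch covers only the voting rule, not the induced distribution.

```python
from itertools import product


def weighted_majority(x, w, q):
    """Return 1 or 0 by the sign of sum_i (2*x_i - 2*q)*w_i, None on a tie.

    That sign equals the sign of sum_i w_i*x_i - q*W, which is what
    is computed here.
    """
    s = sum(wi * xi for wi, xi in zip(w, x)) - q * sum(w)
    if s > 0:
        return 1
    if s < 0:
        return 0
    return None  # the rule is left unspecified on ties


def replicate(x, w):
    """Replace coordinate x_k by w_k identical copies (w_k a nonnegative integer)."""
    return [xk for xk, wk in zip(x, w) for _ in range(wk)]


def check_reduction(w, q):
    """True if weighted majority on x matches simple majority on replicate(x, w)
    for every x in {0,1}^n."""
    for x in product([0, 1], repeat=len(w)):
        y = replicate(x, w)
        if weighted_majority(x, w, q) != weighted_majority(y, [1] * len(y), q):
            return False
    return True


if __name__ == "__main__":
    print(replicate((1, 0, 1), [3, 1, 2]))
    print(check_reduction([3, 1, 2], 0.5))
```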